Overview

Dataset statistics

Number of variables: 6
Number of observations: 14227140
Missing cells: 9
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.3 GiB
Average record size in memory: 399.9 B

Variable types

Text: 6

Alerts

nconst has unique values (Unique)

Reproduction

Analysis started: 2025-03-04 03:58:42.429407
Analysis finished: 2025-03-04 04:05:09.957605
Duration: 6 minutes and 27.53 seconds
Software version: ydata-profiling v4.12.2
Download configuration: config.json
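
The reported duration can be cross-checked from the two timestamps. A minimal sketch using only Python's standard library; the variable names are illustrative and not part of the report:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of the report.
started = datetime.fromisoformat("2025-03-04 03:58:42.429407")
finished = datetime.fromisoformat("2025-03-04 04:05:09.957605")

# Split the elapsed time into whole minutes and remaining seconds.
elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```

The result should agree with the Duration line once the seconds are rounded to two decimals.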

Variables

nconst
Text

Unique 

Distinct: 14227140
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 901.0 MiB

Length

Max length: 10
Median length: 9
Mean length: 9.408556
Min length: 9

Characters and Unicode

Total characters: 133856844
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
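
As a rough illustration of such character properties, the sketch below counts the Unicode general category of each character in a few sample values. It uses Python's standard `unicodedata` module, which exposes the general category but not the script or block, so it covers only one of the three properties listed; it is not necessarily how the report computes them.

```python
import unicodedata
from collections import Counter

# Sample nconst values taken from the report.
values = ["nm0000001", "nm0000002", "nm0000003"]

# Count the Unicode general category of every character
# ("Ll" = lowercase letter, "Nd" = decimal digit).
categories = Counter(
    unicodedata.category(ch) for value in values for ch in value
)
print(categories)
```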

Unique

Unique: 14227140
Unique (%): 100.0%

Sample

1st row: nm0000001
2nd row: nm0000002
3rd row: nm0000003
4th row: nm0000004
5th row: nm0000005

Value                    Count     Frequency (%)
nm0000012                1         < 0.1%
nm9993719                1         < 0.1%
nm0000001                1         < 0.1%
nm0000002                1         < 0.1%
nm9993686                1         < 0.1%
nm9993687                1         < 0.1%
nm9993688                1         < 0.1%
nm9993689                1         < 0.1%
nm9993690                1         < 0.1%
nm9993691                1         < 0.1%
Other values (14227130)  14227130  > 99.9%

Most occurring characters

Value             Count     Frequency (%)
1                 16110889  12.0%
n                 14227140  10.6%
m                 14227140  10.6%
0                 10346563  7.7%
3                 10265509  7.7%
2                 10260491  7.7%
4                 10187127  7.6%
5                 10155167  7.6%
6                 10087140  7.5%
7                 9360859   7.0%
Other values (2)  18628819  13.9%

Most occurring categories

Value      Count      Frequency (%)
(unknown)  133856844  100.0%

Most occurring scripts

Value      Count      Frequency (%)
(unknown)  133856844  100.0%

Most occurring blocks

Value      Count      Frequency (%)
(unknown)  133856844  100.0%


primaryName
Text

Distinct: 10907648
Distinct (%): 76.7%
Missing: 9
Missing (%): < 0.1%
Memory size: 990.9 MiB

Length

Max length: 105
Median length: 78
Mean length: 13.510672
Min length: 1

Characters and Unicode

Total characters: 192218100
Distinct characters: 208
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9798529
Unique (%): 68.9%
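
The gap between the Distinct and Unique counts is easier to see on a toy example. The sketch below assumes that Distinct is the number of different values present and Unique is the number of values that occur exactly once; the `names` list is made up for illustration and is not taken from the dataset.

```python
from collections import Counter

# Hypothetical toy column; not taken from the dataset.
names = ["Ann Lee", "Ann Lee", "Bo Chen", "Cy Diaz", "Cy Diaz", "Di Evans"]

counts = Counter(names)
distinct = len(counts)                               # different values present
unique = sum(1 for c in counts.values() if c == 1)   # values seen exactly once

print(distinct, unique)
```

Under that assumption, Unique can never exceed Distinct, and the two are equal only when no value repeats.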

Sample

1st row: Fred Astaire
2nd row: Lauren Bacall
3rd row: Brigitte Bardot
4th row: John Belushi
5th row: Ingmar Bergman

Value                   Count     Frequency (%)
david                   134327    0.5%
john                    126293    0.4%
michael                 125434    0.4%
james                   87749     0.3%
de                      81558     0.3%
paul                    70434     0.2%
robert                  69162     0.2%
daniel                  68951     0.2%
chris                   68371     0.2%
thomas                  62607     0.2%
Other values (2255244)  28724169  97.0%

Most occurring characters

Value               Count     Frequency (%)
a                   19993595  10.4%
e                   16218300  8.4%
(space)             15391924  8.0%
n                   13140925  6.8%
i                   13045417  6.8%
r                   12022862  6.3%
o                   10491560  5.5%
l                   8834260   4.6%
s                   6982301   3.6%
t                   6211561   3.2%
Other values (198)  69885395  36.4%

Most occurring categories

Value      Count      Frequency (%)
(unknown)  192218100  100.0%

Most occurring scripts

Value      Count      Frequency (%)
(unknown)  192218100  100.0%

Most occurring blocks

Value      Count      Frequency (%)
(unknown)  192218100  100.0%


birthYear
Text

Distinct: 559
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 801.7 MiB

Length

Max length: 4
Median length: 2
Mean length: 2.0900079
Min length: 1

Characters and Unicode

Total characters: 29734835
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 172
Unique (%): < 0.1%

Sample

1st row: 1899
2nd row: 1924
3rd row: 1934
4th row: 1949
5th row: 1918

Value               Count     Frequency (%)
n                   13586838  95.5%
1980                10261     0.1%
1981                9967      0.1%
1979                9878      0.1%
1982                9841      0.1%
1978                9740      0.1%
1983                9473      0.1%
1984                9450      0.1%
1977                9170      0.1%
1985                9111      0.1%
Other values (549)  553411    3.9%

Most occurring characters

Value             Count     Frequency (%)
\                 13586838  45.7%
N                 13586838  45.7%
1                 719330    2.4%
9                 718202    2.4%
8                 209052    0.7%
7                 159796    0.5%
6                 137945    0.5%
2                 128926    0.4%
4                 125781    0.4%
5                 124573    0.4%
Other values (2)  237554    0.8%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  29734835  100.0%

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  29734835  100.0%

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  29734835  100.0%


deathYear
Text

Distinct: 502
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 801.0 MiB

Length

Max length: 4
Median length: 2
Mean length: 2.0338514
Min length: 2

Characters and Unicode

Total characters: 28935889
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 175
Unique (%): < 0.1%

Sample

1st row: 1987
2nd row: 2014
3rd row: \N
4th row: 1982
5th row: 2007

Value               Count     Frequency (%)
n                   13986314  98.3%
2021                7607      0.1%
2022                7246      0.1%
2020                7223      0.1%
2023                7004      < 0.1%
2024                6284      < 0.1%
2019                6100      < 0.1%
2018                5866      < 0.1%
2016                5760      < 0.1%
2017                5741      < 0.1%
Other values (492)  181995    1.3%

Most occurring characters

Value             Count     Frequency (%)
\                 13986314  48.3%
N                 13986314  48.3%
2                 195513    0.7%
0                 195308    0.7%
1                 192239    0.7%
9                 161615    0.6%
8                 46958     0.2%
7                 40013     0.1%
6                 35292     0.1%
4                 33918     0.1%
Other values (2)  62405     0.2%

Most occurring categories

Value      Count     Frequency (%)
(unknown)  28935889  100.0%

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  28935889  100.0%

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  28935889  100.0%


primaryProfession
Text

Distinct: 23206
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 938.9 MiB

Length

Max length: 67
Median length: 64
Mean length: 12.197141
Min length: 2

Characters and Unicode

Total characters: 173530437
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5599
Unique (%): < 0.1%

Sample

1st row: actor,miscellaneous,producer
2nd row: actress,soundtrack,archive_footage
3rd row: actress,music_department,producer
4th row: actor,writer,music_department
5th row: writer,director,actor

Value                 Count    Frequency (%)
n                     2783996  19.6%
actor                 2516928  17.7%
actress               1615254  11.4%
miscellaneous         822004   5.8%
producer              487849   3.4%
camera_department     439594   3.1%
art_department        265486   1.9%
writer                230821   1.6%
sound_department      222345   1.6%
composer              174667   1.2%
Other values (23196)  4668196  32.8%

Most occurring characters

Value              Count     Frequency (%)
r                  19906824  11.5%
e                  19883937  11.5%
t                  18424334  10.6%
a                  16519238  9.5%
c                  12799795  7.4%
o                  11508296  6.6%
s                  10609925  6.1%
n                  7763103   4.5%
m                  7670612   4.4%
i                  7291113   4.2%
Other values (16)  41153260  23.7%

Most occurring categories

Value      Count      Frequency (%)
(unknown)  173530437  100.0%

Most occurring scripts

Value      Count      Frequency (%)
(unknown)  173530437  100.0%

Most occurring blocks

Value      Count      Frequency (%)
(unknown)  173530437  100.0%


knownForTitles
Text

Distinct: 5914773
Distinct (%): 41.6%
Missing: 0
Missing (%): 0.0%
Memory size: 992.8 MiB

Length

Max length: 43
Median length: 42
Mean length: 16.170076
Min length: 2

Characters and Unicode

Total characters: 230053935
Distinct characters: 14
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4889031
Unique (%): 34.4%

Sample

1st row: tt0072308,tt0050419,tt0027125,tt0031983
2nd row: tt0037382,tt0075213,tt0117057,tt0038355
3rd row: tt0057345,tt0049189,tt0056404,tt0054452
4th row: tt0072562,tt0077975,tt0080455,tt0078723
5th row: tt0050986,tt0069467,tt0050976,tt0083922

Value                   Count     Frequency (%)
n                       1620890   11.4%
tt0123338               8258      0.1%
tt22014400              7508      0.1%
tt6168110               6382      < 0.1%
tt0441074               4882      < 0.1%
tt0072584               4305      < 0.1%
tt0159881               4068      < 0.1%
tt11874658              3905      < 0.1%
tt0479832               3898      < 0.1%
tt4202558               3624      < 0.1%
Other values (5914763)  12559420  88.3%

Most occurring characters

Value             Count     Frequency (%)
t                 46537780  20.2%
0                 22939005  10.0%
1                 21137684  9.2%
2                 19761831  8.6%
4                 17136694  7.4%
3                 16568506  7.2%
8                 15932856  6.9%
6                 15672449  6.8%
5                 13793179  6.0%
7                 13492850  5.9%
Other values (4)  27081101  11.8%

Most occurring categories

Value      Count      Frequency (%)
(unknown)  230053935  100.0%

Most occurring scripts

Value      Count      Frequency (%)
(unknown)  230053935  100.0%

Most occurring blocks

Value      Count      Frequency (%)
(unknown)  230053935  100.0%

Missing values

Figure: A simple visualization of nullity by column.
Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.

Sample

index  nconst     primaryName      birthYear  deathYear  primaryProfession                   knownForTitles
0      nm0000001  Fred Astaire     1899       1987       actor,miscellaneous,producer        tt0072308,tt0050419,tt0027125,tt0031983
1      nm0000002  Lauren Bacall    1924       2014       actress,soundtrack,archive_footage  tt0037382,tt0075213,tt0117057,tt0038355
2      nm0000003  Brigitte Bardot  1934       \N         actress,music_department,producer   tt0057345,tt0049189,tt0056404,tt0054452
3      nm0000004  John Belushi     1949       1982       actor,writer,music_department       tt0072562,tt0077975,tt0080455,tt0078723
4      nm0000005  Ingmar Bergman   1918       2007       writer,director,actor               tt0050986,tt0069467,tt0050976,tt0083922
5      nm0000006  Ingrid Bergman   1915       1982       actress,producer,soundtrack         tt0034583,tt0038109,tt0036855,tt0038787
6      nm0000007  Humphrey Bogart  1899       1957       actor,producer,miscellaneous        tt0034583,tt0043265,tt0033870,tt0037382
7      nm0000008  Marlon Brando    1924       2004       actor,director,writer               tt0078788,tt0068646,tt0047296,tt0070849
8      nm0000009  Richard Burton   1925       1984       actor,producer,director             tt0061184,tt0087803,tt0059749,tt0057877
9      nm0000010  James Cagney     1899       1986       actor,director,producer             tt0029870,tt0031867,tt0042041,tt0034236

index     nconst     primaryName         birthYear  deathYear  primaryProfession                    knownForTitles
14227130  nm9993709  Lu Bevins           \N         \N         producer,director,writer             tt17717854,tt11772904,tt11772812,tt11697102
14227131  nm9993710  Nestor Rudnytskyy   \N         \N         \N                                   \N
14227132  nm9993711  David Gluzman       \N         \N         \N                                   \N
14227133  nm9993712  Corny O'Connell     \N         \N         \N                                   \N
14227134  nm9993713  Sambit Mishra       \N         \N         writer,producer                      tt20319332,tt27191658,tt10709066,tt15134202
14227135  nm9993714  Romeo del Rosario   \N         \N         animation_department,art_department  tt11657662,tt14069590,tt2455546
14227136  nm9993716  Essias Loberg       \N         \N         \N                                   \N
14227137  nm9993717  Harikrishnan Rajan  \N         \N         cinematographer                      tt8736744
14227138  nm9993718  Aayush Nair         \N         \N         cinematographer                      tt8736744
14227139  nm9993719  Andre Hill          \N         \N         \N                                   \N
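
The sample rows carry the literal string `\N` where a value is absent, yet the columns that contain it report 0 missing values, which suggests the marker was profiled as ordinary text. A minimal sketch, assuming `\N` is the null marker, of counting it per column on three of the rows above; the helper name `count_sentinels` is hypothetical and not part of any library:

```python
NULL_MARKER = r"\N"  # assumption: this literal marks a missing value

columns = ["nconst", "primaryName", "birthYear", "deathYear",
           "primaryProfession", "knownForTitles"]

# Three rows copied from the sample table in the report.
rows = [
    ["nm0000003", "Brigitte Bardot", "1934", r"\N",
     "actress,music_department,producer",
     "tt0057345,tt0049189,tt0056404,tt0054452"],
    ["nm9993710", "Nestor Rudnytskyy", r"\N", r"\N", r"\N", r"\N"],
    ["nm9993717", "Harikrishnan Rajan", r"\N", r"\N",
     "cinematographer", "tt8736744"],
]

def count_sentinels(rows, columns, marker=NULL_MARKER):
    """Return how many cells in each column equal the marker."""
    counts = {name: 0 for name in columns}
    for row in rows:
        for name, cell in zip(columns, row):
            if cell == marker:
                counts[name] += 1
    return counts

sentinel_counts = count_sentinels(rows, columns)
print(sentinel_counts)
```

Treating the marker as missing before profiling would move these cells from the value counts into the Missing figures.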